home *** CD-ROM | disk | FTP | other *** search
Text File | 1989-11-13 | 66.3 KB | 1,214 lines |
- { MPI - Multiple Precision Positive Integer Arithmetic Package }
- { Copyright 1989, KSU Research Foundation, Manhattan KS 66506 }
- { License granted to copy, but not for sale or profit }
- M U L T I P L E P R E C I S I O N P O S I T I V E
- I N T E G E R A R I T H M E T I C
- by Kenneth Conrow
- CTA, Cardwell Hall
- Kansas State University
- Manhattan, KS 66506
-
-
- This is a multiple precision positive integer (MPI) arithmetic package.
- The two versions supplied handle integers up to 149 or up to 3029
- decimal digits. The package is written in Eric Isaacson's A86 assembler
- for the 8086 series of processors. It runs on the 8086, 80286, and
- 80386 processors and uses word oriented processing, i.e. employs integer
- word (16-bit at a time) arithmetic. While it runs on the 80286 and
- 80386 processors, it uses none of the enhanced commands available on
- those processors, but is written to the common denominator represented
- by the 8086 chip. No use is made of 8087 coprocessor commands, and no
- advantage will accrue to owners of a coprocessor in their use of these
- routines.
-
- The arithmetic operations supported herein are addition, subtraction,
- multiplication, division, left shifting (by 1 to 15 bits), and
- probabilistic primality testing. To allow input and output of large
- numeric values for the package, two conversion routines are provided.
- For input, COND2B converts a decimal numeric string into the internal
- binary notation the package uses. For output, CONB2D converts an
- internal binary number into a decimal numeric string which can be output.
-
- In addition to the assembler routines which are the core of this
- contribution, there are also included a number of Turbo Pascal programs
- which exercise the MPI routines. These are of some interest in their
- own right. Most computer owners will be impressed with the output of
- MERSENNE, which forms and prints the value of the first few Mersenne
- numbers. Only persons with rather specialized interests will be fond of
- prime number generation (P50TO128), or factoring (CFFFCOMB). But the
- main point of the example routines is to illustrate for potential users
- some of the capabilities and techniques for use of high precision
- positive integer arithmetic.
-
- The package is provided in two forms: as the source code and as the
- assembled object code. Most users should find that the object code can
- be linked with object code from other sources for use without
- modification and that possession of the A86 assembler is not required.
- If, for some reason, reassembly is required, the A86 assembler is widely
- available from shareware distribution libraries and bulletin boards.
- Alternatively, the A86 source code could be revised to meet the
- requirements of a different assembler.
-
- In addition to the functional routines, there is also included a Turbo
- Pascal interface which provides a mechanism by which Turbo Pascal 4 may
- be employed to handle your algorithm. The details of linkage to the
- invoked MPI routines are handled by the interface. The flavor of a
- Pascal routine using this package is rather assembler-level, because you
- have to define a series of registers and keep track of them yourself in
- order to successfully use the package. The details of keeping values in
- registers out of the way of subsequent register reuse by the MPI package
- sometimes entail rather low level considerations which must be met from
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 2
-
- Turbo Pascal. Thus, the interface does provide the programmer with
- nominal high level language access to multiple precision arithmetic, but
- that access remains somewhat awkward and detailed.
-
- While it is true that there is no requirement for ownership of A86 or
- D86 to employ this multiple precision positive integer arithmetic
- package, it is also true that there is a very strong need for some kind
- of debugger (e.g. D86, a good one) in order to successfully program and
- debug programs applying it. Opportunities for error are so rife and the
- territory covered in a typical application is so large that it would be
- very difficult to work effectively without a debugger.
-
-
- --Languages available for multiple precision arithmetic
-
- This package was written to do large volume large positive integer
- arithmetic, especially primality testing, on the microcomputer. One
- wants to have a program which runs as efficiently as possible, hence the
- choice of an assembler language.
-
- REXX is available for PCs and does support multiple precision
- arithmetic, but I have not used it. SAVVY does 64 digit mixed number
- arithmetic (which allows primality testing to about 31 decimal digits)
- and I used it extensively prior to development of this package. The
- name (SAVVY.SAM) of the sample input file for PPTEST derives from this
- historical fact. This package is many fold (about 54) faster than SAVVY
- in primality testing on my particular 8086 machine.
-
- Available multiple precision arithmetic programs/systems on IBM
- mainframes are either interpreted (hence relatively slow, e.g. REXX),
- or designed for floating point numbers (e.g. Ehrman's Multiple
- Precision Floating Point Arithmetic, SPLA 360D-40.4.003 and
- Communications of the ACM, 11, 517-520, July 1968).
-
- The trouble with using mixed/real number arithmetic routines for integer
- arithmetic is that determining the integer remainder from a division is
- a multiple step process -- divide, separate the fractional part, and
- multiply it by the divisor. This is wasteful since integer division
- could more efficiently determine the remainder with even less work than
- dividing out the fractional part in full, let alone the extra
- manipulations of fractional part separation and multiplication. This
- mixed number disadvantage is shared by SAVVY and Ehrman's multiple
- precision arithmetic package. In addition, keeping a floating point
- value normalized and alignment of the operands for addition and
- subtraction by shifting of the mantissa when required (as is done in
- Ehrman's package) represents unnecessary overhead when the work is known
- to involve integer values. The internals of SAVVY are hidden so one
- can't say what internal manipulations limit its speed.
-
-
- --Algorithms and conventions employed herein
-
- This package employs the 'classical' multiple precision arithmetic
- algorithms provided by Donald E. Knuth, The Art of Computer Programming,
- Vol. II, Seminumerical Algorithms, Addison-Wesley, Second Edition
- (1981), pp250ff. Although asymptotically better algorithms sometimes
- exist, it seemed unlikely in the range of precision I have been using
- (up to 29 digits, decimal) that they would provide enough extra
- efficiency to justify their implementation. Interestingly, Knuth (p260)
- estimates that division using the classical algorithm is only about 7%
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 3
-
- more time consuming than multiplication. The exponentiation entailed in
- the probabilistic primality test and in POWERMOD in CFFFCOMB is done by
- Algorithm A, p442, Knuth, idem.
-
- Numbers are represented in this package in radix 65536 notation. The
- words appear in the usual order, with coefficients of decreasing powers
- of 65536 appearing from left to right. Thus, multiple-word values are
- simply interpretable as large binary or hexadecimal numbers. Every
- number (except the result of zero divide) bears a leading zero word as a
- guard word, and the offsets used to indicate operands to the routines
- always point to this guard word. In addition to the offsets, a length
- value (in words) is maintained for every numeric value. Thus, both an
- offset value and a length value are typically passed to a routine to
- specify an operand to it, and a routine typically returns a result as an
- (offset,length) pair. When a subtraction or division operation results
- in reduction of the number of significant words, the offset and length
- values are adjusted to update them (without shifting the significant
- words) before the result descriptor is returned.
-
- The use of a guard word whose value is zero in front of the actual value
- of each number was convenient for several reasons. It is consistent
- with the use of zero-origin indexing so natural in assembler level
- programming. The guard word provides a place for expansion in the event
- that addition or leftshifting results in a carry into an additional
- (radix 65536) digit. The q(0) digit may be filled by Knuth's division
- algorithm, so it is handy to have it present.
-
- The addition, subtraction, left shifting, and division routines actively
- replace any values in the guard words of their arguments with zero.
- The results created by all routines bear a leading zero guard word. If
- a user generates his own values for use as arguments to these routines,
- he should take care to provide and zero this leading guard word, for
- safety's sake, even though the package seems to be fairly tolerant if
- the guard word is sometimes non-zero in the arguments passed it.
-
- The number zero is represented by two consecutive words with value of
- zero. The first is the zero-valued guard word and the second is the
- actual zero which is represented. (O.K., quibbler, this is really a
- multiple-precision non-negative integer arithmetic package.)
-
- This package does its work in a series of registers (which are simply
- word sequences reserved for the specific purpose of holding one or
- another of the values to be manipulated). The user will need to provide
- as many registers as are required to hold the set of values employed by
- his algorithm. (MPI has some internal registers of its own, too). As
- supplied in MPIFACE.OBJ, MULTP.OBJ, and PROBP.OBJ the registers are 32
- words long, which allows for 149 digit decimal numbers. There is no
- general need for the user to use such long registers for his own
- variables if he is certain that his values are always shorter.
- (DIVTEST, for example, uses a smaller value for the register length.)
- However, the space provided to COND2B for it to place the binary
- equivalent of its decimal string argument must be 32 (or 630 for the
- long version) words long, because COND2B assumes this length when it
- calculates the position of the right edge of a register at which to
- align the resulting integer. The variant supplied in MPIFACEL.OBJ has a
- register length of 630 words, enough to hold about a 3029 digit decimal
- number.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 4
-
- Changing the value of RLEN in the RLEN EQU statement in the source code
- from 32 to another number will result in the reservation of a different
- number of words for each internal register employed by the package so
- that longer integers can be processed or space can be saved. If this is
- done, then the code will need to be re-assembled using A86; normally,
- when 31 words (149 decimal digits) of precision (1 word goes to the
- guard word) is adequate, there should be no need to reassemble the code.
- The use of such long lengths for the internal registers may seem
- wasteful, but since there are only 5 such registers (really 4, but one
- is double length), this amounts to only 320 bytes for all the internal
- registers in the short versions (3150 bytes in the long version) so the
- space to be conserved by reassembly is minimal.
-
- Any user-defined registers should be word aligned for maintenance of
- efficiency of operand fetching. A measure of the importance of this
- consideration was gained in timing runs on LUCLEHMR, q.v. below, where a
- penalty of 15% was observed when a heavily used register was
- deliberately mis-aligned to a byte boundary on an 80386 based machine.
-
- (For the curious, MPIPR, MPIQO, MPIXP, and MPIY are the names of the
- internal registers. The MPI prefix denotes "Multiple-precision Positive
- Integer". The variable names employed in the source for this package
- are intended to be peculiar enough that they will not interfere with
- variable names selected by a user wishing to include these routines as
- service routines in his own assembler source code (as is done when 'A86
- PPTEST.8 MULTP.8 PROBP.8' is issued, see below)).
-
-
- --Details about routines
-
- A user intending to access the multiple precision routines from Turbo
- Pascal 4 can probably just skim this description of the internal
- details of this package to get a general impression of the function of
- each of the MPI routines and its limitations, before concentrating on
- reading about the use of the Pascal interface module.
-
- The conventions employed herein are quite idiosyncratic, and not at all
- consistent with conventional practice among microcomputer assembler
- programmers. The routines regularly destroy the contents of the data,
- pointer, and index registers, AX,BX,CX,DX,BP,SI,DI, but do not alter the
- values in the segment registers. The SP value is modified implicitly by
- the pushing and popping activity performed in handling the arguments and
- returning the values via the stack. (This method of handling arguments
- and returned values was learned from an exposure to FORTH.) Local
- variables are mostly kept either in the code segment or the data segment;
- almost none are housed on the stack. Since register contents are
- routinely destroyed by these routines, the user must keep local
- variables around the routine invocations by some mechanism other than
- residence in a register. This is done in these routines whenever one of
- them calls another, lower-level, routine by use of local variables in the
- code or data segment.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 5
-
- All arguments to the routines are passed on the stack; most returned
- values are returned on the stack, except for the primality testing
- routine which returns its result in the AL register. Most numeric
- values are passed by reference via an offset value on the stack; offsets
- of numeric values actually point to the leading guard word (whose value
- is conventionally zero). The following table gives the routine names,
- the arguments to be passed, and the results returned. The caret in
- front of the value component of most arguments means the offset of the
- value is passed; the actual lengths (not offsets of the lengths), in
- words, are stacked; for SHIFTLEFT, the value (not the offset of a value)
- on the stack specifies the number of bits of the left shift.
-
-
- ASSEMBLER LEVEL SUBROUTINE ARGUMENT AND RESULT SUMMARY
-
- routine
- name argument list in stack result list in stack
- _______ __________________________ ________________________
-
- ADDER ^ad1,length(ad1),^ad2,length(ad2) ^sum,length(sum)
- CONB2D ^right(decstr),^op,length(op) ^left(decstr),length(decstr)
- COND2B ^left(register),^left(decstr) ^result,length(result)
- DIVIDE ^dvd,length(dvd),^dsr,length(dsr) ^quot,length(quot),^rem,length(rem)
- MULTIPLY ^mp1,length(mp1),^mp2,length(mp2) ^prod,length(prod)
- PROBPRIM ^op,length(op) {'P' or 'C' in AL}
- SHIFTLEFT ^op1,length(op1),2**n ^result,length(result)
- SUBTRACT ^sh1,length(sh1),^sh2,length(sh2) ^diff,length(diff)
-
- All arguments and result lists are listed in order of stacking. Thus,
- ^ad1 is stacked before its length before ^ad2 before its length in the
- invocation of ADDER. ADDER stacks ^sum before it stacks the length of
- the sum in its return of the result. All lengths of numeric values are
- expressed in words (units of 16 bits); all lengths of decimal strings
- are expressed in bytes (characters).
-
- ADDER
-
- The sum from ADDER replaces the value of its first argument. That is,
- the offset value returned from ADDER is either unchanged from that in
- its first argument or points a very few words lower in memory. (Only
- the possibility of growth in length due to the addition makes it
- necessary to return the result pointers on the stack at all). The second
- argument to ADDER is not changed by it, except that the guard word is
- zeroed.
-
- SUBTRACT
-
- The first argument to SUBTRACT must be larger than the second so that
- the result of the subtraction will be positive (remember: this is a
- positive integer arithmetic package). The difference (the second
- argument is subtracted from the first) replaces the first argument, and
- the returned (offset,length) pair reflects any length loss when the two
- arguments are close to one another in value. The second argument to
- SUBTRACT is not changed, except that the guard word is zeroed.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 6
-
- The requirement that the first argument to subtract be larger than the
- second is sometimes awkward. For example, in the calculation of
- v:=a*(u'-u)+v' in CFFFCOMB, see below, it was necessary to proceed as
- in v:=v'+a*u'-a*u. The result v is always positive even though the
- difference u'-u is sometimes negative.
-
- MULTIPLY
-
- MULTIPLY returns its result in a special product register (MPIPR) which
- is double length. MULTIPLY destroys neither of its arguments. Since
- MULTIPLY clears MPIPR as it starts operation, neither of its arguments
- can be in MPIPR (i.e. neither can be the in situ result of a previous
- multiplication). To do a chain of multiplications, you must move the
- earlier result into a work register and then use the work register as
- the argument to the next MULTIPLY invocation (as is done, for example,
- in COND2B (at the assembler level) and in DIVTEST (at the Pascal
- level, see below)).
-
- The MULTIPLY routine requires that its arguments be correctly
- normalized, i.e. that the pointers to its arguments point to a single
- guard word preceding the significant words of the values. This
- requirement arose from a desire to minimize overhead in probablistic
- primality testing; MULTIPLY takes a zero value in word[1] of an operand
- as a true zero without checking the length field, and immediately
- returns a zero result if word[1] of either operand is zero.
-
- DIVIDE
-
- When DIVIDE operates, the quotient is returned as the first returned
- value ((offset,length) pair) in a special quotient register (MPIQO) and
- the remainder is returned as the second returned value ((offset,length)
- pair) in the register the dividend was in. Thus, DIVIDE destroys its
- first argument (the dividend) by leaving the remainder in its place.
- The second argument (the divisor) is net unchanged by DIVIDE, but the
- divisor has sometimes undergone normalization and unnormalization
- (Knuth's words, see his division algorithm) by two left shifts totalling
- 16 bits, followed by a right shift of one word. (The right shift of one
- word is done to avoid the possibility that a divisor might be shifted
- out of its register by repeated consecutive use.) Since the guard word
- has been filled and a second guard word sometimes appropriated in such
- normalization activity, two guard words should be available in front
- of a value which is to be used as divisor in a DIVIDE invocation. This
- observation is especially cogent if a value in a constant table is used
- as a divisor by DIVIDE.
-
- In the event of division by zero, DIVIDE returns the highest possible
- integer in MPIQO. This result is referenced in the result list by both
- the quotient offset and the remainder offset. This value is a string
- of X'FFFF values of length RLEN (32) words (even the guard word has
- X'FFFF in it). Thus, if the user is concerned with the possibility of
- division by zero, he should test the guard word in the quotient or
- remainder and discover the X'FFFF in the event of a zero divide having
- occurred. In the event of division into zero, the quotient and
- remainder are both zero (both returned offsets reference the
- zero-valued dividend). If the dividend is smaller than the divisor, the
- quotient (zero) is created in MPIQO and the dividend is returned as the
- remainder.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 7
-
- A chain of divisions (as in CONB2D, where a series of divisions by 10
- are performed) cannot be performed in situ (because when the quotient in
- the MPIQO register from the previous DIVIDE is passed as dividend to a
- subsequent DIVIDE, new quotient words overwrite the current dividend
- words, still in use, and chaos ensues). It is necessary to move the
- contents of the quotient register into a work register in order to
- perform a subsequent division in a chain of divisions.
-
- Although a chain of either multiply or divide operations are not
- possible without moving an earlier result into a work register, it is
- possible to do a multiply followed by a divide without moving any
- intermediate results to a work register. Since probabilistic primality
- testing involves raising a large number to a large power modulus another
- large number, which involves repeatedly multiplying and then dividing to
- determine the remainder, this capability was a design feature required
- for efficiency. This process uses the product from a MULTIPLY (in
- MPIPR) as the dividend in the DIVIDE. The remainder stays in the MPIPR
- register, from which it must be removed before the next MULTIPLY.
-
- SHIFTLEFT
-
- The register whose value is to be left shifted is indicated as usual as
- an (offset,length) pair in the first and second stacked arguments.
- SHIFTLEFT's stacked third argument is that power of 2 which results in
- the correct amount of left shifting when it is multiplied into
- SHIFTLEFT's other argument. (I.e., to left shift n bits, use a third
- argument whose value is 2**n.) The shifted value is returned in the
- same register that it came in; only the possibility that the length and
- starting position may have changed through overflow requires that the
- result descriptor be returned on the stack. Zero bits come in from the
- right. (Note to Turbo Pascal users: the MPIFACE routine accepts a
- pointer to a word containing 2**n and moves the value 2**n to the stack
- before calling SHIFTLEFT.)
-
- Note that right shifting by n bits can be achieved by left shifting 16-n
- bits followed by a length decrement of 1 word. (The length decrement
- effects the discard of the bits which a right shift would have shifted
- off the edge of the rightmost word. If the SHIFTLEFT routine detects
- overflow, it automatically appropriates a new guard word and increments
- the length value.) Doing repeated right shifts by this device to a
- value in a register is not advisable because it will eventually result
- in the value drifting leftwards out of the register. The fact that a
- right shift can be achieved by a complementary left shift implies that
- the omission of a right shift routine is not a genuine limitation. An
- example of the use of LEFTSHIFT for the purpose of doing a right shift
- appears in the Pascal routine, LUCLEHMR, q.v. below.
-
- The LEFTSHIFT routine is employed in the division routine to normalize
- and unnormalize the divisor, dividend, and remainder. The LEFTSHIFT
- routine achieves its end by a single left to right pass across its
- operand in a 7-command loop (compare the 13-command innermost loop in
- multiply). Hence, multiplication and division by a power of 2 is more
- rapidly achieved by left or right shifts than by multiplication or
- division.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 8
-
- PROBPRIM
-
- The PROBPRIM routine accepts an odd multiple precision integer as
- argument in the usual (offset,length) pair form and returns either a 'P'
- or a 'C' in register AL, according as the number passed it is probably
- Prime or Composite. This method of returning a result is unique in this
- package. PROBPRIM does not destroy its argument, but the value has
- often been manipulated in situ by PROBPRIM. Since PROBPRIM works by
- doing a long sequence of multiplications and divisions, the number whose
- primality is to be tested must not reside in either the MPIPR or the
- MPIQO register, but must instead (if it was produced in either of these)
- be moved to a work register whence it may be passed as an argument. The
- argument for PROBPRIM must be odd -- it does not give reliable results
- for even arguments, which are, after all, always composite. (2 is not a
- large enough number to be an interesting prime.)
-
- COND2B
-
- The decimal string which is passed to COND2B for conversion to internal
- format must be terminated with a blank or a control character (any
- character whose value is less than or equal to X'20). This arises
- effortlessly (in assembler) if the input values are read into a buffer
- from a blank-delimited text file with carriage-return and/or line-feed
- as the record delimiters. A pointer to the leftmost position of the
- numeric string in the buffer can be passed to COND2B and either the
- internal blank in the record or the trailing record terminator then
- serves as the closing delimiter of the decimal string for COND2B. (Note
- to Pascal programmers: Turbo Pascal does not include the record
- delmiters in the string value it reads, so you must append a blank to a
- decimal string before passing it to COND2B.) Leading blanks and
- internal commas are tolerated (ignored) in the decimal string passed as
- an argument to COND2B. The second argument is a pointer to the left
- edge of a register in which the equivalent numeric value is to be
- returned. COND2B assumes the register is RLEN (32, in the usual case)
- words long and returns the binary (hex) value as the result, right
- justified in the register, pointed to by the usual (offset,length) pair
- on the stack.
-
- CONB2D
-
- CONB2D converts an internal multiple precision number into a decimal
- character string. The first argument points to the rightmost position
- for the returned decimal character string; that byte is filled with X'00
- and the positions to its left are filled with the successively more
- significant decimal digits which represent the value in the register
- specified in the (offset,length) pair of the third and fourth arguments
- to CONB2D. The second argument (the length component of the first
- (offset,length) pair) is just a dummy; any value will serve. The result
- is returned as the usual (offset,length) pair. The offset points to the
- first character of a decimal string delimited in a way which will allow
- it to serve in a subsequent invocation of COND2B. The length value
- returned from CONB2D is the string length (in bytes) of the decimal string
- returned, including the terminal delimiter.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 9
-
- CONSTANT VALUES AS ARGUMENTS
-
- Since many MPI routines zero the leading words of their arguments, some
- caution must be observed in using values in constant tables as arguments
- to routines, lest the values of constants in the tables to the left of
- those used as arguments be incorrectly zeroed as side-effects of the
- subroutines. This is especially important in constants used as divisors
- where the normalization and un-normalization of Knuth's division
- algorithm can cause the use of two guard bytes. The cautious approach
- is to move a value from the constant table into a work register and then
- use the work register as the argument to a routine. The more efficient
- approach is simply to provide an adequate number of leading words in the
- constant tables to avoid any bad side effect of this zeroing and
- normalizing activity. This approach was used for the powers of 2 in
- DIVTEST.PAS, for example.
-
-
- --Discussion of probabilistic primality testing
-
- It is the nature of the probabilistic primality test to determine
- absolutely if a number is composite, but only to determine with odds of
- 3:1 that a number is prime. Therefore, if a number is ever found to be
- composite, that result may be depended upon. If a number is said to be
- prime (with only 1 chance in 4 that it isn't so), the action to take is
- to resubmit it to PROBPRIM repeatedly to get a number of independent
- assessments. If, in two successive determinations PROBPRIM says 'P',
- the odds are only 1 in 16 that it is really composite; if three, only 1
- in 64; if four, only 1 in 256; and so on. By doing a large number, m,
- of successive determinations, if all results are 'P', one can be sure of
- the primality of the argument to a vanishingly small probability (1 in
- 4**m) of being incorrect. (The 1 in 4**m chance of error is only what
- is provably true. Actually the 'P' determination is likely much more
- probably correct than that. One very rarely observes PROBPRIM "changing
- its mind" after an initial 'P' is returned. The example routine
- included here (PPTEST.8) does only a single primality test for each
- element of a 15-member set of closely spaced numbers, regardless of
- whether the first determination of each element results in 'P' or 'C'.)
-
- PROBPRIM maintains a random number (2 words long) which it employs as a
- fraction by which to multiply the primality candidate (N) to calculate
- the number (X<N) which is to be raised to the N - 1 power in the
- probabilistic primality algorithm. The next number in the pseudo-random
- sequence is calculated every time PROBPRIM is invoked. This
- pseudo-random number sequence starts with a constant seed (1223312233,
- X'48EA4369) so it traverses the same sequence on every use of the
- package. If you intend to make an independent primality determination
- by rerunning a sequence of invocations, be sure to change things so that
- each primality test becomes associated with a different pseudo-random
- number on each of the independent runs you make. This can easily be
- done by cyclically permuting the records in the file of test cases.
-
- Several features were built into the code of PROBPRIM and the supporting
- routines to increase the efficiency of the primality testing. The
- requirement that its argument be odd allows the PROBPRIM code to start
- right off with the assumption that the rightmost bit is lit and thus
- avoid one cycle of the major loop in that routine. (Even numbers except
- for 2 are not prime -- don't even test them.) This is the first
- efficiency feature used here.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 10
-
- The raising of a randomly selected large number (X<N) to a large power
- (N-1) in the probabilistic primality test entails repeatedly squaring it
- (X**P, P=1,2,4,8,16,... in MPIXP) once for each bit in the binary
- representation of the power (N-1) and then multiplying that result into
- an accumulator (Y in MPIY) if the given bit is lit. (All this is done
- modulo the primality candidate (N in MPIN), which means a division
- follows each multiplication.) Since this amounts to much labor in
- processing each bit it is important to stop as soon as all significant
- bits are processed. A special test is incorporated into the PROBPRIM
- code to detect when the most significant word is in progress and then to
- test when the most significant bit is finished so that the routine can
- stop its major loop promptly at that time. This is the second
- efficiency feature used.
-
- The division routine often determines that the first quotient digit is
- zero in the long division process. When the quotient digit is zero,
- there is little point in going through the lengthy multiplication of the
- divisor by zero and subtraction of the (zero) product words from the
- dividend words. Accordingly, the division routine skips around the
- multiply and subtract loop when the quotient digit is zero. This is a
- third efficiency feature in the code.
-
-
- --Detailed Technical Notes
-
- The .COM and .OBJ files supplied here were produced by the following
- commands:
-
- A86 MULTP.8 TO MULTP.OBJ (RLEN=32)
- A86 MULTP.8 PROBP.8 TO PROBP.OBJ (RLEN=32)
- A86 PPTEST.8 MULTP.8 PROBP.8 TO PPTEST.COM (RLEN=32)
- A86 MPIFACE.8S MULTP.8S PROBP.8S TO MPIFACE.OBJ (RLEN=32 version)
- A86 MPIFACE.8L MULTP.8S PROBP.8S TO MPIFACEL.OBJ (RLEN=630 version)
-
- If reassembly is required, the above commands can be employed again to make
- revised versions. The size of MPIFACE as assembled above is under 5K
- so the space requirements for its use are minimal. MPIFACE and MPIFACEL
- are the interface routines which can be called from Turbo Pascal to
- interface to the Multiple Precision Integer arithmetic routines. MPIFACE
- is the version with ordinary (32 word) precision, and MPIFACEL has
- extraordinary (630 word) precision.
-
- The MULTP routines include the basic operations without probabilistic
- primality testing. PROBP contains the probabilistic primality testing
- routine (which calls the basic ones, so it is assembled with them.)
-
- If the user wants to use only the basic routines without the
- probabilistic primality testing, he links MULTP.OBJ into his other
- routines. If he wants to invoke PROBPRIM as well as the basic routines,
- he links PROBP.OBJ into his other routines. The PPTEST.COM is for
- demonstration purposes, to illustrate how the routines might be employed
- from the assembler level. To include the MPI routines in your own A86
- code, just substitute your source code name for 'PPTEST.8' in the third
- line above to produce a <yourfn>.COM file which includes the MPI
- support.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 11
-
- To use the routines from Turbo Pascal 4, make MPIFACE (or MPIFACEL)
- available to the Turbo compiler by using the compiler directive {$L
- mpiface} (or {$L mpifacel}) and invoking the interface routines as
- demonstrated in the TURBTEST.PAS, DIVTEST.PAS, P50TO128.PAS,
- MERSENNE.PAS, LUCLEHMR.PAS, and CFFFCOMB.PAS programs.
-
-
- --Description of the test and timing results at the assembler level
-
- The PPTEST routine and the files SAVVY.SAM and ANSWER.SAM are provided
- to give a working example, its input, and the expected results. This
- example operates entirely at the A86 language level, without Pascal.
- The decimal number in each record of SAVVY.SAM (the input file) is a
- homologue of 67. The PPTEST program tests the homologues of each of the
- fifteen primes from 11 ... 67 for primality and produces a 'PC' string
- indicating which members are probably prime and which are composite.
- The individual primality candidates are obtained by subtracting from the
- input number {56, 54, 50, 48, ... 8, 6, 0} (these are the differences
- between 67 and each of the fifteen primes in the 11 ... 67 range) in
- turn to determine the values of the homologues of those primes. Thus,
- the test file provides 300 examples (20*15) of the use of the PROBPRIM
- routine with 28 and 29 digit numbers. The demo set is highly selected
- to contain cases which give relatively frequent primes --
- run-of-the-orchard homologues of 67 are far less likely to contain
- primes than this sample set. For one thing, none of the 15 members of
- each of the homologous sets has a divisor less than 50,000; even among
- homologous sets relatively prime through 50,000, this is a selected
- subset particularly rich in primes.
-
- The following timings are all run on the example SAVVY.SAM herein.
-
- PPTEST runs in about 280 seconds on an 8086 machine running at 8MHz.
- PPTEST runs in about 59 seconds on an 80286 machine running at 12MHz.
- PPTEST runs in about 26 seconds on an 80386 machine running at 20MHz.
-
- It took a SAVVY program 15,100 seconds (yes, 4.2 hrs) to do this work on
- the same 8086 machine at 8MHz. That represents a speed improvement of a
- factor of 54 in the code and an overall performance improvement, code
- and machine, of a factor of 580!
-
-
- --Use of the Turbo Pascal interface routine, MPIFACE
-
- The MPIFACE routine exists for the sole purpose of allowing access to
- the MPI routines from a high level language. If you are working solely
- at the assembler level, you will not need to employ MPIFACE at all. As
- mentioned above, it isn't necessarily easy to use MPI even from Pascal,
- but at least the mechanism is there and the examples should be consulted
- for details and ideas about how to get various jobs done. The examples
- contain some variety of viewpoint and sophistication.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 12
-
- Since the user must define registers to contain his own values and must
- invoke the routines of the multiple precision package in the appropriate
- sequence to accomplish his ends, the general feeling of a Pascal program
- which does multiple precision arithmetic is that of an assembler
- language program. Thus, the user programs at an operation by operation
- level, is concerned with register assignments, with the possible loss of
- values in registers through side effects of a called routine, with
- maintenance of values across routine calls where values are destroyed by
- a routine's action, and (sometimes) with the "hardware" details
- associated with production or maintenance of values right justified in
- the integer registers.
-
- The basic mechanism for passing arguments from Pascal to -- and
- receiving values back from -- the multiple precision routines via the
- MPIFACE interface routine is to employ a control record, whose fields
- are filled with the values describing the operation to perform and its
- operands, and in whose fields are found the descriptors of the returned
- results.
-
- The declaration of this control record appears in the *.PAS examples.
- The details of this record need not be of concern because it is easily
- filled by use of the MAKECONT routine, below, whose arguments provide
- the values to be placed in the control record. Suffice it to say that
- the control record contains the operation to be performed, two pointer
- values and two length values, which together suffice to specify the
- arguments and results for any of the MPI routines.
-
- The MPIFACE module moves/converts the information in the control record
- into the stacked argument list used by the multiple precision package,
- makes the indicated call, and then moves/converts descriptors of the
- returned values from the stack into the control structure. Only the
- operand fields required for the specified routine are referenced, and
- only those required to describe the value returned by the invoked
- routine are replaced upon return, but dummy pointer values are often
- arbitrarily changed in the control structure (even though not used)
- because a global approach was taken to the problem of pointer to offset
- interconversion.
-
- The following table summarizes the effect of each routine on the
- contents of the registers/strings employed as arguments. An entry 'r'
- in column 3 indicates a returned value replaces the argument; an entry
- 'd' in column 3 indicates an argument is destroyed; an entry 's' in
- column 3 indicates an argument survives the routine's action without
- change; an additional entry 'm' indicates that the argument has been
- modified during the routine but restored so that it survived unchanged
- overall; an additional entry 'g' indicates that the guard word is actively
- zeroed or that the argument is (sometimes) padded on the left with one
- or two zero guard words as a side effect of the action of the routine.
- The two conversion routines return a length word in addition to the
- location word among the arguments.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 13
-
- SUMMARY OF ARGUMENTS AND SIDE EFFECTS FOR PASCAL PROGRAMMERS
-
- effect on other arg
- action arguments in control record argument return must
- ABBRev'n ('?' means dummy) values medium avoid
- ________ ________________________________ _________ ________ _____
-
- ADDEr ^ad1,^ad2,length(ad1),length(ad2) rg,sg,r,s
- CONB2d ^right(decstr),^op,?,length(op) r,d,-,r MPIPR
- COND2b ^left(register),^left(decstr),?,? r,s,r,-
- DIVIde ^dvd,^dsr,length(dvd),length(dsr) r,smg,r,smg (MPIQO) MPIQO
- MULTiply ^mp1,^mp2,length(mp1),length(mp2) s,s,s,s (MPIPR) MPIPR
- PROBprim ^op,?,length(op),? sm,-,sm,- (AL) MPIPR,MPIQO
- SHIFtleft ^op1,^2**n,length(op1),? rg,s,rg,-
- SUBTract ^sh1,^sh2,length(sh1),length(sh2) rg,sg,r,s
-
- The carets (^) in this table represent pointers, not offsets (unlike the
- previous similar table).
-
- In the view of the MPI routines, all variables are addressable from the
- CS register value (.COM routine style). Pascal is using various
- registers for addressing data, in various situations. The interface
- makes appropriate offset changes coming and going to accommodate these
- differing views to one another. (In brief summary, MPIFACE keeps the
- offset differences and corrects the offsets back to Pascal's view before
- return.) To allow this accommodation of register usages, registers
- passed to the MPI routines must all occur within a single 64K segment so
- that the largest offset from the CS register will never exceed X'FFFF.
- That can limit the total size of the invoking Pascal routine, the
- interface, and the multiple precision routines. CFFFCOMB, the largest
- program yet attempted, had to be carefully tuned to achieve a MAXM value
- of 114 by making all the registers global, and collecting all variables
- to be passed to MPIFACE together at the front of the main routine.
-
- When a value from a register must be kept for future use because it is
- about to be destroyed as a side effect of some operation, both the value
- in the register and its (pointer,length) descriptor must be kept. There
- are two basic ways of handling this. One way is to copy the value and
- keep its descriptor and later move the value back into its original
- location before use. In this case, the descriptor can be used just as
- it was. The other way is to use the value in the register it was moved
- into. In this case, the values in the descriptor must be reassigned so
- that it represents the position of the value in the new register, not
- the position in the original register from which it was saved. The
- first method involves two moves, but no descriptor change. The second
- method involves one move and one descriptor change. There is a compact
- example of the second method among the examples at the end of this
- document.
-
- Moving a value can itself be done in either of two ways: either the
- whole register can be moved, regardless of the size of the portion of
- it that is filled, or just the filled portion can be moved. These
- options appear (and are commented) among the examples of PASCAL code at
- the end of this document.
-
- These complications are illustrated in the various *.PAS example
- programs herein. A few key cases are excerpted near the end of this
- document. By studying the examples, it will be possible to gradually
- develop a full understanding of all these details.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 14
-
- Since the PROBPRIM routine returns its result by a unique mechanism, it
- is invoked by PPTFACE(<ctrlptr>) and the result is assigned into a
- character. The usual invocation uses MPIFACE(<ctrlptr>) and the result
- is assigned into a ctrlptr. The entry points are synonymous and, except
- for this name change, which serves to fool the Pascal typing
- requirements, preparation for invocation of the primality test is
- identical with that of the other routines.
-
- To bring its parameter-passing convention into conformity with the
- practice in the rest of the package, a pointer to a power of two is
- employed as argument in calling the routine SHIFTLEFT. The interface
- accepts the pointer and uses it to get the value of 2**n which it stacks
- to pass to the routine.
-
-
- --Description of tests and timing results
-
- A number of test programs (demonstrations) using Turbo Pascal programs
- as drivers of the multiple precision subroutines are included. These
- programs did yeoman duty in revealing bugs and suggesting revisions in
- the interface and in the MPI routines to improve their overall accuracy
- and utility. The programs in this category are TURBTEST, DIVTEST,
- P50TO128, MERSENNE, LUCLEHMR and CFFFCOMB.
-
- While doing these programs was exhausting, it is probably not true that
- they provided an exhaustive test of the routines, and users are
- cautioned to test their use thoroughly in their own applications before
- completely trusting them. I would be grateful to hear of any bugs which
- you may detect.
-
- TURBTEST just raises 111111111 to the fourth power by successive
- squarings, tests the result for primality (guess what?--it is
- composite), and outputs the value of the fourth power and the discovery
- that it is composite. The program illustrates (very basic, primitive)
- techniques for copying the values of registers which are about to be
- destroyed as a side effect of the MPI routine to be called and provides
- examples of control structure and pointer manipulation in this
- connection. The elements of the control structure are individually
- manipulated in inline code to establish the correct environment for an
- MPI call in this example. Note the Pascal declarations of the MPI
- interface routine under two pseudonyms and the {$I+ directive} which
- includes the interface code and MPI routines. TURBTEST runs so rapidly
- that timing is hardly a question.
-
- DIVTEST is an implementation of a single example of the division
- algorithm test suggested by Knuth (p.261). It consists of forming
- the product of 2**48-1 and 2**80-1 and then dividing that product by
- one of its two factors. The form of the product puts the division
- routine through its less frequently utilized branches. The fact that
- the other factor is produced as the quotient verifies the correct
- operation of the division routine in this probing instance.
-
- Note that the code in DIVTEST is considerably simplified by the fact
- that the lengths of the values are throughout handled in the
- programmer's head, rather than by saving and using length values
- generated during the running of the program. It will be rare programs
- indeed in which this degree of predictability will permit this stratagem
- to be employed. DIVTEST runs so rapidly that timing information is
- irrelevant.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 15
-
- DIVTEST contains two internal procedures which may help to simplify
- (they certainly shorten) the Pascal programs which invoke MPI. The
- procedure MAKECONT fills in the fields of the control record preparatory
- to an invocation of MPI through the interface. MAKECONT takes an
- argument list of six items, the pointer to the control record to use,
- the pointers to the two argument values, the lengths of the two
- arguments, and the 4 character operation to be performed, and
- distributes them properly in the indicated control record. By using
- this procedure a great deal of the diffuseness of the code evident in
- TURBTEST is avoided, and the source code is much tighter (and more
- readable?). If an operation does not require a full panoply of arguments,
- the slots in the argument list of MAKECONT are just filled in with
- dummies. These have been denoted with '?' in the above table. The use
- of MAKECONT does not afford opportunity to achieve (admittedly very
- minor) efficiencies by reuse of values already in control structures
- without respecification, since MAKECONT requires all its arguments and
- fills them in regardless of their prior presence or status as a dummy.
-
- The procedure MOVEIT moves a value from an MPI register into a
- user-defined register when it is required to retain a value about to be
- destroyed as an MPI side-effect. It can also be used, of course, to do
- a simple value-transferring assignment. MOVEIT moves only the guard
- word and significant words of the value in a register. MOVEIT takes
- three arguments, the pointer to the (zeroth word of the) register the
- value is to be moved into, the pointer to the value to be moved, and the
- length of the value to be moved. MOVEIT proceeds to move the value as
- requested. Retention and/or revision of pointer information, which
- usually is required whenever MOVEIT is used, must be done independently
- of MOVEIT. MOVEIT assumes that REGMAX defines all register lengths.
- Its value will be either 31 or 629, according as MPIFACE or MPIFACEL is
- in use. Note that REGMAX is one less than RLEN, because registers in
- PASCAL have subscripts [0..REGMAX], while registers in A86 are RLEN
- words long, with zero-origin subscripting understood.
-
- Here is the code for the two internal procedures which reduce the volume
- of PASCAL code in programs using MPI; MAKECONT makes a control record;
- MOVEIT moves a value from one MPI register to another.
-
- procedure makecont(cptr:ctrlptr; a1,a2:regptrtyp; l1,l2:word; op:c4);
- {arg 1 points to ctrl structure, 2&3 point to values, 4&5 give the
- lengths of the values, and 6 the operation to be performed}
- {this routine simply stuffs the control structure specified in arg 1}
- begin
- cptr^.ptr1 := a1;
- cptr^.ptr2 := a2;
- cptr^.ln1 := l1;
- cptr^.ln2 := l2;
- cptr^.operation := op;
- end;
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 16
-
- procedure moveit(into,from:regptrtyp; l:word);
- {arg 1 points to the into register at its zeroth word, arg 2 points to
- the value in the from register at its guard word, arg 3 gives the
- length of the value in the from register}
- {this routine moves a value in a register to another location}
- {control information about the value must be handled separately}
- var i:byte;
- begin
- for i := 0 to l do
- into^[regmax-l+i] := from^[i];
- end;
-
- The program P50TO128 merely identifies the prime numbers, as determined
- from three probabilistic primality tests, between 50,000 and 128K.
- Primes in other numeric ranges could easily be produced by simple
- modification of this routine. The program is quite unexceptional, and
- provides a simple example of a Pascal program which employs the MPI
- routines. It runs in about 42 minutes on the 8086 machine at 8MHz.
-
- The program MERSENNE looks at the Mersenne numbers, 2**n - 1, where n is
- prime, and determines if they are prime or not by use of PROBPRIM.
- MERSENNE runs through the odd integers from 3 to nearly 500, determining
- whether each odd integer is prime or composite and, for those found to
- be prime, calculating 2**n - 1 and determining whether that (Mersenne
- number) is prime or composite. An appropriate output line is produced
- for each odd integer. The Mersenne numbers are output in decimal
- notation, thus exercising CONDB2D. The program shuts down when it
- determines that the Mersenne number would be too large to fit in the
- default register size of 32 words (about 149 decimal digits). The
- MERSENNE program could be revised with increased values of RLEN in the
- A86 routines and REGMAX in the Pascal routine, but this has not been
- done, since the run times would become excessive and the probabilistic
- primality test is not the test customarily employed to determine
- primality of Mersenne numbers.
-
- The MERSENNE program provides an example of direct generation of
- multiple precision integers in a register at the Pascal level by taking
- advantage of the fact that 2**n is just a one bit followed by n zero
- bits. Subtraction of 1 then produces the Mersenne number. The program
- employs three probabilistic primality tests for each value of n to
- achieve odds of 64:1 that n is really prime, but only a single primality
- test is used for each 2**n -1. This combination gets the right answers,
- but a few values of n between 3 and 500 give incorrect answers in the
- absence of the triple application of the probabilistic primality test.
-
- MERSENNE runs in 42 minutes and 10 seconds on an 8086 machine at 8mHz.
- 37 42 (89.40%)
- MERSENNE runs in 9 minutes and 48 seconds on an 80286 machine at 12mHz.
- 7 39 (78.06%)
- MERSENNE runs in 4 minutes and 35 seconds on an 80386 machine at 20mHz.
- 3 24 (74.18%)
-
- The second set of numbers were obtained after a program improvement
- which consisted of suppressing the unnecessary zeroing of leading words
- of the shorter of the two arguments to ADDER and SUBTRACT.
- Interestingly, the faster machine benefitted to a greater degree from
- this change than the slower ones.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 17
-
- The LUCLEHMR program implements the Lucas-Lehmer test of primality of
- Mersenne numbers. (See Knuth, ob cit, Theorem L, p 391.) This program
- has been created using the high precision version of MPI, MPIFACEL.OBJ.
- The run time would be excessive if the whole range of numbers from 3 to
- 10000+ were processed so the program has been arranged to skip over long
- sequences of values of n which are known not to give prime Mersenne
- numbers. (The run time is still excessive, and no one will wish to run
- this entire program.) This program starts like MERSENNE in determining
- which odd numbers are primes, but then determines the primality of 2**n
- - 1 and outputs the value of each prime n in an appropriate column in
- accord with the result of the Lucas-Lehmer procedure. There is no
- production of the value of the Mersenne number either in the machine or
- for output. The correct answers, i.e. those values of n for which 2**n
- - 1 is prime, are 2, 3, 5, 7, 13 ,17, 19, 31, 61, 89 ,107, 127, 521,
- 607, 1279, 2203, 2281, 3217, 4253, 9689, 9941, .... .
-
- The notable feature of the code is the gyration to achieve the
- separation of L**2 - 2 into a least significant half of 2**q - 1 bits
- and a most significant half of the same length to add together (remember
- casting out nines?) to form (L**2 - 1) mod (2**q - 1). This is
- complicated by the necessity to maintain correct argument lengths and
- offsets. It may be high level language code, but that does not make it
- readable!
-
- Tests of the effect on the running time of the alignment of the MPIPR
- register in memory were made using minor variations of the LUCLEHMR
- program, obtained by shifting some of the 'byte' variables into the list
- of 'word' variables at the PASCAL source code level. (I was unable to
- discover a more rational way of obtaining program versions with specific
- memory alignments in the linked-in assembler portions.) The D86
- debugger was then used to determine the alignment of MPIPR and timing
- runs made both with a byte-aligned variant and a word-aligned variant.
- It turned out that a performance penalty of 15.5% results from
- byte-aligning instead of word-aligning the MPIPR register, which plays
- such an active role in LUCLEHMR.
-
- The CFFFCOMB (Continued Fraction Factor, Find COMBination) program
- implements Knuth's continued fratcion algorithm (ibid, p381, algorithm
- E) and his linear combination algorithm (ibid, answer to problem 12,
- p610). It will usually result in discovery of a factoring. The
- declaration of a large number of registers to hold the large set of
- variables in that algorithm and a nearly slavish following of the algorithm
- have led to a fairly straight-forward transcription of the algorithm at
- the Turbo Pascal level. This routine seems the best to look at for an
- illusion of distance from the assembler level.
-
- A 'friendly' interface to CFFFCOMB is provided in FACTOR.BAT. Use it to
- get your feet wet or for casual applications. For production factoring,
- you'll want to manipulate the input and output files for yourself, which
- will require learning the information in the following paragraphs.
-
- To use CFFFCOMB to discover the factors of composite numbers in the
- general range of up to 29 decimal digits, put the numbers to be factored
- in NSIN.FIL, one to a line. Use PRIMES.FIL as a second input file.
- CFFFCOMB selects a value of k to avoid the occasional recalcitrant case
- which arises if a particular value of k is used regularly. {Because k
- is selected on the basis of ten useable primes, be sure to have enough
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 18
-
- primes in PRIMES.FIL so that at least ten primes get included.} CFFFCOMB
- selects those primes which it can use in factoring each different
- composite and uses not more than 114 of them. Some experimentation may
- be required to determine the optimal number of primes to load into
- PRIMES.FIL (from PRIMES29.FIL) for any given numeric range of problem,
- but the choice is not usually very critical. CFFFCOMB puts its results
- into a file called OUTFILE.FIL.
-
- When CFFFCOMB runs, the first screen output gives the list of primes
- which will be used in seeking a factor, the value of k used, and the
- number, m, of primes in the list. Second and subsequent lines to the
- screen give the value of x (in the form of a base 65536 number), the
- string of exponents, e[i], which satisfy Knuth's equation (18), and a
- parenthesized count of the number of such discoveries made and
- accumulated. The linear combination portion of the program watches for
- a combination even in the e's which produces a factoring (or a 'glitch'
- message if x=y or x+y=n for a particular combination). A final line
- gives the factors discovered, and the factorization is also output to
- OUTFILE.FIL. The factoring will be found before the number of
- accumulated instances of satisfaction of (18) exceeds m+2.
-
- The trade-off in use of CFFFCOMB is as follows. The more primes that
- are used as the basis, the more frequently equation (18) is satisfied.
- But at the same time, more divisions will be required to find each
- instance and more instances will be required to find an even linear
- combination of the exponents. If too few primes are used (for the size
- of the number being factored), it will be a long time between
- discoveries of instances of satisfaction of (18), and factoring will
- take a very long time. If outputs to the screen are extremely rapid,
- you probably have more than the needed number of primes in PRIMES.FIL,
- and cutting that number down may speed up the process. The trade-off
- does not seem to be very critical, and the process is usually marginally
- faster if more primes are used.
-
- CFFFCOMB is an example of a program whose capacity is restricted by a
- 64K size limit. The number of primes used in the continued fraction
- stage is limited. By rearranging the order of declaration of the
- variables, it proved possible to make all variables which MPI must
- access fall within its 64K addressing range, and an array size of 114
- could be specified. At this point, the variables (to which e, used in
- accumulating the even linear combination of exponents, is the largest
- contributor) are just short of exceeding the 64K size limitation of the
- data segment of a Turbo Pascal 4 program. This limits the size of
- number which can be factored to something a little over 29 digits.
-
- The exponents accumulated in seeking an even linear combination of
- exponents sometimes exceeded the capacity of a longint (about 2E9),
- overflow occured, and the value 1 was reported as a factor during early
- trials. When the linear combination algorithm was altered so that the
- primes are scanned from highest to smallest, this behavior stopped.
- Thus the maximum 114 primes can be used routinely, at the possible cost
- of unecessary time expenditure. If 1 is reported as a factor due to
- this overflow, reduce the number of primes supplied in PRIMES.FIL so
- that fewer than 114 primes are in the basis, let the computer run
- longer, and the resulting less severe accumulation of the exponents may
- avoid the overflow.
-
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 19
-
- Sometimes the continued fraction portion of the program cycles
- prematurely. This behavior seems to be associated with an unfortunate
- choice of k. It can usually be avoided by moving the block of primes
- from 3 through 31 from its normal place in PRIMES.FIL to the end of
- PRIMES.FIL. This alters the set of primes on whose basis k is selected,
- thus alters k, a different continued fraction itinerary is followed, and
- the premature cycling is avoided. The same remedy is effective if so
- many 'x=y' or 'x+y=n' glitches arise that the program quits without
- finding a factor, presumably because the accumulation of the exponents
- makes an independent and more felicitous start in the altered rerun.
-
- CFFFCOMB performs factoring of numbers up to 29 digits in a truly
- impressive manner. It factored the 15 members of a pentadecet site, all
- of whose members were composite and with no factor smaller than 50,000,
- in an average of just under 20 minutes each on the 386 machine at 20
- MHz. This is the NSIN29 example.
-
- --Example fragments illustrating features of Pascal programs using MPI
-
- { these are the PASCAL declarations which support the A86 subroutines, and
- the compiler directive which includes the MPI package }
- {$L mpiface}
- constant regmax = 31;
- type
- c4 = packed array[1..4] of char;
- register = array[0..regmax] of word;
- ctrlptr = ^ctrl;
- regptrtyp = ^register;
- ctrl = record
- operation: c4; {routine name abbrev.}
- ptr1,ptr2: regptrtyp; {argument locations}
- ln1,ln2: word; {argument lengths}
- end;
- function mpiface(a: ctrlptr): ctrlptr; external;
- function pptface(a: ctrlptr): char; external;
-
- *******************************************************************************
-
- { here's an illustration of prefixing 'constant' values with zero
- values as guard words by interspersing the values with zeroes}
- var zone,one,ztwo3,two3,ztwo5,two5,ztwo15,two15 : word;
-
- {we easily achieve the leading zero-valued guard word requirement :}
- one := 1; two3 := 8; two5 := 32; two15 := 32768;
- zone := 0; ztwo3 := 0; ztwo5 := 0; ztwo15 := 0;
-
- *******************************************************************************
-
- { here's a fragment of a chain mutliply from DIVTEST; first establish the
- address that the control structure resides at; then fill the control
- structure to square 2**15; then invoke mpiface; then move the copy
- to an auxiliary register to get ready for a chain multiply }
-
- argsptr := addr(argsdescr);
- {note how we address the guardwords, not the values themselves}
- makecont(argsptr,addr(ztwo15),addr(ztwo15),1,1,'MULT');
- argsptr := mpiface(argsptr);
- {we have to make a copy of this product to chain multiply it}
- moveit(addr(two30),argsptr^.ptr1,argsptr^.ln1);
- {now we probably need a new descriptor for that returned, moved value}
- two30ln := argsptr^.ln1; two30pt := addr(two30[regmax - two30ln)];
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 20
-
- *******************************************************************************
-
- { here's a piece from DIVTEST; it makes a copy of a register in store because
- if left in MPIPR it would be destroyed; it clears the output string; it
- creates the control structure in argsptr; it invokes mpiface to do CONB;
- and it finally outputs the decimal string }
-
- {the whole register is moved though under half full}
- store := two48t80m;
- output := ' ';
- {we provide the descriptor components for the value in store on the fly}
- makecont(argsptr,addr(output[40]),addr(store[regmax-8]),1,8,'CONB');
- argsptr := mpiface(argsptr);
- writeln('Current answer is: ',output);
-
- *****************************************************************************
-
- { fragment from MERSENNE which does three prob. prim. tests, setting
- result to 'C' if composite is returned as result at any time }
- { z1 is a zero word stored right before pcan, the variable whose value
- is really to be tested }
-
- result:='P';
- for i:=1 to 3 do begin
- makecont(argsptr,addr(z1),addr(z1),1,1,'PROB');
- porc := pptface(argsptr);
- if porc = 'C' then result := 'C'
- end;
-
- *****************************************************************************
-
- { fragment from program using a vector of control structures; first we
- get a copy of a homologue of 67; then we divide; then see if the
- remainder from division is one of the magic 15 {56,54,...,8,6,0} in
- deadset, which indicates a composite among the pentadecet members; then
- if so we test the primality of the quotient, having moved it into a work
- register, to see whether this was a complete factoring of that
- pentadecet element or not }
-
- moveit(addr(work),candescr[i].ptr1,candescr[i].ln1); {only filled words move}
- makecont(argsptr,addr(work[regmax-candescr[i].ln1]),primedescr.ptr1,
- candescr[i].ln1,primedescr.ln1,'DIVI');
- argsptr:=mpiface(argsptr);
- if (argsdescr.ln2 = 1) and (argsdescr.ptr2^[1]<=56) and
- (argsdescr.ptr2^[1] in deadset) then
- begin
- writeln(primein,' divides ',pelin[i],' R',argsdescr.ptr2^[1]:2);
- moveit(addr(work),argsdescr.ptr1,argsdescr.ln1);
- makecont(argsptr,addr(work[regmax-argsdescr.ln1]),argsdescr.ptr1,
- argsdescr.ln1,argsdescr.ln1,'PROB');
- porc:=pptface(argsptr);
- end;
-
- *******************************************************************************
- MULTIPLE PRECISION INTEGER ARITHMETIC Page 21
-
-
- { compact illustration of production and use of a copy of a value preparatory
- to division which otherwise would have destroyed the value }
- { instead of establishing a separate pointer and length for work,
- values are calculated on the fly }
-
- moveit(addr(work),candescr[i].ptr1,candescr[i].ln1);
- makecont(argsptr,addr(work[regmax-candescr[i].ln1]),primedescr.ptr1,
- candescr[i].ln1,primedescr.ln1,'DIVI');
-
- Advertisement(s)
-
- The source codes for several of the Pascal programs included here were
- developed using my SPA:WN package. SPA:WN is an acronym for Structured
- Programming Automated: Warnier Notation. The advantage of SPA:WN is
- that it allows design and debugging of a program (in any language, but
- Pascal here) with automated presentation of the program state at any
- time in the form of a Warnier diagram and automated production of target
- language source code at any time. The structured nature of a program
- developed in this way permits relatively easy revisions, even when they
- are deep-seated, and the constant availability of an up-to-date Warnier
- diagram is invaluable during the detailed debugging process. The *.WAR
- files herein are the source and site of maintenance of the *.PAS and
- *.EXE files with identical filenames. The interested reader may enjoy
- comparing the relatively non-procedural presentation of the code in
- the *.WAR versions with the classical procedural versions in the *.PAS
- files. SPA:WN is available from various shareware distribution centers
- or from me for a self-addressed, stamped mailer, and 5 1/4" diskette.
-
- Any serious use of this package should occasion remission of a suitable
- contribution to the Research Foundation, 146 Durland Hall, Kansas State
- University, Manhattan, KS 66506. $50.00 is a sugested amount.
-
- The names Turbo Pascal, SAVVY, A86, and D86 are trade marks or are
- copyrighted by their respective owners.
-
-
- I would be grateful to hear of any bugs you may discover in use of these
- routines. All comments, corrections, suggestions may be sent to:
-
- Kenneth Conrow
- Manager, Academic User Services
- Computing Center, Cardwell Hall
- Kansas State University
- Manhattan KS, 66506
- Phone: (913) 532-6311
- Bitmail: KCONROW AT KSUVM